Speaker clustering and transformation for speaker adaptation in speech recognition systems
نویسندگان
چکیده
A speaker adaptation strategy is described that is based on finding a subset of speakers, from the training set, who are acoustically close to the test speaker, and using only the data from these speakers (rather than the complete training corpus) to reestimate the system parameters. Further, a linear transformation is computed for every one of the selected training speakers to better map the training speaker’s data to the test speaker’s acoustic space. Finally, the system parameters (Gaussian means) are reestimated specifically for the test speaker using the transformed data from the selected training speakers. Experiments showed that this scheme is capable of providing an 18% relative improvement in the error rate on a large-vocabulary task with the use of as little as three sentences of adaptation data.
منابع مشابه
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
متن کاملSpeaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
متن کاملStudies in transformation-based adaptation
This paper studies the use of transformation-based speaker adaptation in improving the performance of large vocabulary continuous speech recognition systems. We present a formulation of the adaptation procedure that is simpler than existing methods. Our experiments demonstrate that speaker normalization continues to be important even after signi cant amounts of speaker adaptation. An automatic ...
متن کاملThe use of speaker correlation information for automatic speech recognition
This dissertation addresses the independence of observations assumption which is typically made by today’s automatic speech recognition systems. This assumption ignores within-speaker correlations which are known to exist. The assumption clearly damages the recognition ability of standard speaker independent systems, as can seen by the severe drop in performance exhibited by systems between the...
متن کاملACOUSTIC MODEL ADAPTATION FOR AUTOMATIC SPEECH RECOGNITION AND ANIMAL VOCALIZATION CLASSIFICATION by
ACOUSTIC MODEL ADAPTATION FOR AUTOMATIC SPEECH RECOGNITION AND ANIMAL VOCALIZATION CLASSIFICATION Jidong Tao, B.Eng., M.S. Marquette University, 2009 Automatic speech recognition (ASR) converts human speech to readable text. Acoustic model adaptation, also called speaker adaptation, is one of the most promising techniques in ASR for improving recognition accuracy. Adaptation works by tuning a g...
متن کاملSubword-dependent speaker clustering for improved speech recognition
Speaker variability has a significant impact to the state-of-the-art speech recognition systems. Traditionally speaker clustering is performed without considering individual or class phonetic similarities across different speakers. In fact, clustered speaker groups may have very different degrees of variations for different phonetic classes. In this paper, speaker clustering is performed at sub...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- IEEE Trans. Speech and Audio Processing
دوره 6 شماره
صفحات -
تاریخ انتشار 1998